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Abstract 



Object orientation provides a flexible framework for the implementation of the 
convolution of arbitrary distributions of real-valued random variables. 

We discuss an algorithm which is based on the Discrete Fourier Transformation 
and its fast computability via the Fast Fourier Transformation. It directly applies to 
lattice-supported distributions. In the case of continuous distributions an additional 
discretization to a linear lattice is necessary and the resulting lattice-supported distri- 
butions are suitably smoothed after convolution. 

We compare our algorithm to other approaches aiming at a similar generality as to 
accuracy and speed. In situations where the exact results are known, several checks 
confirm a high accuracy of the proposed algorithm which is also illustrated at approx- 
imations of non-central ^^-distributions. 

By means of object orientation this default algorithm can be overloaded by more 
specific algorithms where possible, in particular where explicit convolution formulae 
are available. 

Our focus is on R package distr which includes an implementation of this ap- 
proach overloading operator for convolution; based on this convolution, we define 
a whole arithmetics of mathematical operations acting on distribution objects, com- 
prising, among others, operators +, — , *, /, and a. 

Keywords: probability distributions, FFT, convolution, random variables, S4 classes, S4 meth- 
ods. 



Convolution of (probability) distributions is a standard problem in statistics. For its im- 
plementation the Fast Fourier Transformation (FFT) has been common practice ever since 
the appearance of Cooley and Tukey (1965). 

Combined with an object oriented programming (OOP) approach, this technique gets even 
more attractive: We may use it as a default algorithm in situations where no better alter- 
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native is known, while in special cases as e.g. those of normal or Poisson random variables, 
where convolution reduces to transforming the corresponding parameters, a dispatching 
mechanism realizes this and replaces the general method by a particular (possibly exact) 
one. The user does not have to interfere with the dispatching mechanism, but is rather 
provided with one single function/binary operator for the task of convolution. 

We discuss this approach within the R project (cf. R Development Core Team (2011)) where 
it is implemented in package distr available on CRAN. Package distr provides classes for 
probability distributions within the S4 OOP-concept of R; see Ruckdeschel, Kohl, Stabla, 
and Camphausen (2006, 2010). 

In this context, convolution is the workhorse for setting up a whole arithmetics of math- 
ematical operations acting on distribution objects, comprising, among others, operators 
+, — , /, and A. In this arithmetics, we identify distributions with corresponding (inde- 
pendent) random variables: If XI and X2 are corresponding distribution variables, X1+X2 
will produce the distribution of the sum of respective (independent) random variables, i.e. 
their convolution. Technically, speaking in terms of programming, we have overloaded the 
operator "+" for univariate distributions. 

Convolution itself is computed according to the actual classes of the operands, with par- 
ticular (exact) methods for e.g. normal or Poisson distributions. 

R> libr ar y ( di str ) 
R> Nl ^ Norm (mean =1, sd = 2) 
R> N2 ^ Norm (mean = -2 , sd = 1) 
R> Nl + N2 

Distribution Object of Class: Norm 
mean: -1 

sd: 2.23606797749979 

In the default method distributions are discretized to lattice form and the Discrete Fourier 
Transformation (DFT) is applied. Thus, our general-purpose algorithm needs no assump- 
tions like Lebesgue densities. 

R> Ul ^ Unif(Min = 0, Max = 1) 

R> U3 ^ convpow(Ul, N = 3) 

R> plot (U3 , cex. inner = 1, 

+ inner = c("density", "cdf", "quantile function")) 

While all our applied techniques are not novel in themselves, and much of the infrastructure 
(FFT in particular) has already been available in R for long, the combination as present 
in our approach is unique. Neither in core R nor in any other contributed add-on pack- 
age available on the standard repositories, i.e.; CRAN, Bioconductor, or Rmetrics, there is 
a similarly general approach: We provide a " + " (aka convolution) operator applying to 
[almost] arbitrary univariate distributions, no matter whether discrete or continuous; more 
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Figure 1: Plot of 3-fold convolution of a Unif(0,l) object. 



specifically we cover every distribution that is representable as a convex combination of 
an absolutely continuous distribution and a discrete distribution. In addition, the return 
value of this "+" operator is again a distribution object, i.e.; consisting not only of either a 
cumulative distribution function (cdf) or a density /probability function, but automatically 
of all four constitutive functions, i.e.; cdf, density, quantile function, and random number 
generator. Accuracy of our default methods can be controlled through global options, see 
?distroptions. Just to illustrate our point, we take up the initial example and compute the 
1/3-quantile of the convolution AA(1,2) * unif(0, 1)*'^ * Poisson(l), as well as evaluate its 
density at the vector (0.5, 0.8) 

R> P ^ Pois ( lambda=l) 
R> D ^ Nl + U3 + P 
R> q(D)(l/3) 
[1] 2.490786 

R> d(D) (c(0 .5 ,0 .8)) 

[1] 0.07526675 0.08894269 

The approach is not restricted to academic purposes: the results are sufficiently accurate 
to be used in practice in many circumstances: Be it quite general compound distribution 
models as relevant in actuarial sciences, be it very flexible model fitting techniques as 
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described in detail in Kohl and Ruckdeschel (2010), or be it very general robustification 
techniques as in packages RobAStBase, RDptEst and specialized to Biostat applications in 
RobLoxBioc, compare Kohl (2005) and Kohl and Deigner (2010). 

When interest lies in multiple convolutions (of identical summand distributions) we provide 
a function convpow to quickly and reliably compute convolution powers; in particular sample 
size then is not an issue. Otherwise, i.e.; for non-identically distributed summands, you 
either have to appeal to asymptotics in some way or do it summation by summation. We 
can say though, that our approach works reliably to up to 40 (non-)iid summands. In each 
case, we automatically provide respective quantile functions which are of particular interest 
in actuarial sciences and risk management. 

Our paper is organized as follows: 

In Section 2, we discuss how an object oriented framework could enhance implementations 
of both probability distributions in general and convolution algorithms in particular. To 
this end, we sketch our implementation of distribution classes in R package distr. We 
also briefly discuss the dispatching decisions involved when a new object of a distribution 
class is generated by convolution. In Section 3, we present the general purpose FFT- 
Algorithm and some ramifications. Some forerunners in this direction and connections to 
other approaches are discussed in Section 4. In Section 5 we present checks for the accuracy 
and computational efficiency of our algorithm. At the end of this paper we provide some 
conclusions in Section 6 

2. OOP for probability distributions and convolution 
2.1. OOP for probability distributions 

There is a huge amount of software available providing functionality for the treatment of 
probability distributions. In this paper we will mainly focus on S, more specifically, on 
its Open Source implementation R, but of course the considerations also apply for other 
extensible packages like XploRe, Gauss, Simula, SAS or MATLAB. All these packages provide 
standard distributions, like normal, exponential, uniform, Poisson just to name a few. 
There are limitations, however: You can only use distributions which either are already 
implemented in the package or in some add-on library, or distributions for which you 
yourself have provided an implementation. Automatic generation of new distributions is 
left out in general. 

In many natural settings you want to formulate algorithms once for all distributions, so you 
should be able to treat the actual distribution, say D, as argument to some function. This 
requires particular data types for distributions. Going ahead in this direction, you may 
wish to formulate statements involving the expectation or variance of functions of random 
variables as you are used to in Mathematics; i.e., no matter if the expectation involves a 
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finite sum, a sum of infinite summands, or a (Lebesgue) integral. Tliis idea is particularly 
well-suited for OOP, as described in Booch (1995), with its paradigms "inheritance" and 
"method overloading". 

In the OOP concept, we could let a dispatching mechanism decide which method to choose 
at run-time. In particular, the result of such an algorithm may be a new distribution, as 
in our convolution case. 

In his JAVA MCMC-simulation package HYDRA, Warnes (2002) heads for a similar OOP 
approach. Under http: //statdistlib. sourcef orge .net/, the author provides a set of JAVA 
classes representing common statistical distributions, porting the C-code underlying the R 
implementation. But, quoting the author himself from the cited web-page, "[o]ther than 
grouping the PDF, CDF, etc into a single class for each distribution, the files don't (yet) 
make much use of 00 design." 

2.2. OOP in S: The S4-class concept 

In base R, OOP is realized in the S3-class concept as introduced in Chambers (1993a, b), 
and by its successor, the S4-class concept, as developed in Chambers (1998, 1999) and 
described in detail in Chambers (2008). We work with the S4-class concept. 
Using the terminology of Bengtsson (2003), this concept is intended to be FOOP (function- 
object-oriented programming) style, in contrast to COOP (class-object-oriented program- 
ming) style, which is the intended style in C++, for example. 

In COOP style, methods providing access to or manipulation of an object are part of the 
object, while in FOOP style, they are not, but rather belong to so-called generic functions 
which are abstract functions allowing for arguments of varying type/class. A dispatching 
mechanism then decides on run-time which method best fits the signature of the function, 
that is, the types/classes of (a certain subset of) its arguments. In C++, "overloaded func- 
tions" in the sense of Stroustrup (1987, Section 4.6.6) come next to this concept. 
FOOP style has some advantages for functions like "+" having a natural meaning for many 
operand types/classes as in our convolution case. It also helps collaborative programming, 
as not every programmer providing functionality for some class has to interfere into the 
original class definition. In addition, as S respectively, R is an interpreted language, a 
method incorporated in a S4-class definition would not simply be a pointer but rather the 
whole function definition and environment. Hence, the COOP-style paradigm in (standard) 
R entails arguable draw-backs and hence is not generally advisable within the S4-class sys- 
tem. Since R version 2.12.0, this has been overcome to some extent, however, with the 
introduction of reference classes. 

Since its introduction to R, the S4-class concept has allowed COOP style, that is, members 
(or slots in S4-lingo) have always been permitted to be functions, but we may say that 
use of functional slots in S4 is not standard, which may be judged against a thread on the 
R mailing list r-devel on http://tolstoy.newcastle.edu.aU/R/devel/04a/0185.html. 
Use of functional slots has been extensively used in Bengtsson's (2003) R.oo package where 
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the author circumvents the above-mentioned problems by a non-standard call-by-reference 
semantic. 

For our distribution classes, we, too, use the possibility for function-type members, albeit 
only in a very limited way, and not extending the standard S4 system in any respect. Still, 
others have suggested to rather follow the S4-generic way for slots r,d,p,q, which however, 
in our opinion, would lead to many class definitions [a new one generated at each call to 
the convolution operation] instead of only few class definitions as in our design. 

2.3. Implementation of distribution classes within the S4-class concept 

In S/R, any distribution is given through four functions: r, generating pseudo-random 
numbers according to that distribution, d, the density or probability function/counting 
density, p, the cdf, and q, the quantile function. This is also reflected in the naming 
convention [prefix] <name> where [prefix] stands for r, d, p, or q and <name> is the 
(abbreviated) name of the distribution. 

We call these functions constitutive as we regard them as integral part of a distribution 
object, and hence realize them as members (slots) of our distribution classes even though 
this causes some "code weight" for the corresponding objects. A real benefit of this approach 
is grouping of routines which represent one distribution instead of having separate functions 
rnorm, dnorm, pnorm, and qnorm which otherwise are only connected by gentleman's 
agreement / naming convention. 

Consistency may become an issue then, of course: We cannot exclude the possibility that 
someone (inadvertedly) puts together inadequate r, d, p, or q slots, manipulating the slots 
by assignments of the like a@b ^4. This is not the intended way to generate distribu- 
tion objects, though. We do have generating functions for this purpose, the return values 
of which are consistent; the same goes for automatically generated distributions arising 
as return values from arithmetic operations. In addition, we do provide a certain level 
of consistency, following Gentleman (2003) and providing corresponding accessor- and re- 
placement functions for each of the slots. We strongly discourage the use of the a-operator 
to modify or even access slots r, d, p, and q and explicitly have mentioned this in Ruck- 
deschel. Kohl, Stabla, and Camphausen (2011, section 9 and Example 13.7) at least since 
2005. 

Another justification for this approach can be given by considering convolution: Assume 
we would like to automatically generate the constitutive functions for the law of expressions 
like X+Y for objects X and Y of some distribution class. Following the FOOP paradigm 
the function cdf to compute the cdf would not be part of the class but some method of a 
corresponding generic function. Then, as the constitutive functions vary from distribution 
to distribution and the dispatching mechanism makes its decision which method to use 
for cdf based on the signature, we would have to derive a new method for cdf for every 
(new) distribution class and would in particular need a new class for every newly generated 
distribution. That is, very soon the dispatching mechanism would have to decide between 
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lots of different signatures. In contrast, when cdf is a member of a class, dispatching is not 
necessary and calculations are more efficient. This efficiency is not obtained by extracting 
the, say, cdf as a functional slot, instead of getting it from dispatch after a quick look-up in 
a hash table, but rather by the necessity to have a sufficiently general class for the return 
value of convolution of arbitrary distributions: As a rule, the convolution of two arbitrary 
distributions / and g will generate a new distribution f * g for which there has not been an 
implementation before. So in order to have access to f*g in FOOP manor, you either have 
to compute cdf or density or quantile function "on the fly" for each evaluation or you have 
to generate a new S4 class and a hash table to re- find the particular cdf of f*g when calling 
something like cdf (conv(f ,g) ) or you have to limit the class of admitted operands (argu- 
ments) of convO , such that the result object is again a member of a (possibly parametric) 
set of distribution functions. 

In fact, R package actuar, Dutang, Goulet, and Pigeon (2008), pursues the FOOP approach 
just sketched in their function aggregateDist. To escape the possible multitude of new 
distribution classes, the authors restrict themselves to particular probability distributions; 
i.e., the "the (a, b, 0) or (a, b, 1) families of distributions" (see cited reference for their 
definition). Doing so, they can offer alternatives to compute the convolution (see help to 
aggregateDist). 

Their approach and ours do combine well though: Our extension package distrEx even 
depends on package actuar, using some of the additional root distributions provided there; 
these distributions are implemented efficiently as sets of functions interfacing to C, and their 
names follow the above-mentioned [prefix] <name> paradigm. 

2.4. Convolution as a particular method in "distr" 

Contrary to the r, d, p, and q functions just discussed, the computation of convolutions 
ideally fits in the FOOP-setup where method dispatching works as follows: 
In the case that there are better algorithms or even exact convolution formulae for the 
given signature, as for independent variables distributed according to Bin(nj,p), i = 1,2, 
or Poisson(Aj) or A/'(/Uj, af) etc., the dispatching mechanism for S4-classes will realize that, 
will use the best matching existing "+"-method and will generate a new object of the cor- 
responding class. However, this case is exceptional. Hence, we do not have to dispatch 
among too many methods. 

As our object oriented framework allows to override the default procedure easily by more 
specialized algorithms by method dispatch, the focus of our default algorithm. Algo- 
rithm 3.4, is not to provide the most refined techniques to achieve high accuracy but 
rather to be applicable in a most general setting. This default algorithm is based on FFT 
and will be described in detail in the next section. It originally applies to distribution 
objects of class LatticeDistribution . A lattice distribution is a discrete distribution whose 
support is a lattice of the form ao + iw, uq £ M., w G \ {0} with i G No (or {0, 1, . . . , n}, 
n S N). In our implementation this class is a subclass of class DiscreteDistribution which 
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in addition to its respective mother class UnivariatcDistribution has an extra slot support, 
a numerical vector containing the support (if finite, and else a truncated version carrying 
more than 1 — e mass). Besides DiscrctcDistribution, class UnivariatcDistribution has sub- 
classes AbscontDistribution for absolutely continuous distributions, i.e. distributions with a 
(Lebesgue) density, and UnivarLcbDccDistribution for a distribution in Lebesgue decomposed 
form, i.e. a mixture of an absolutely continuous part and a discrete part. Such distributions 
e.g. arise from truncation operations, or when a discrete distribution (with point mass at 
{0}) is multiplied with a(n) (absolutely) continuous one. 

Our FFT-based algorithm starts with two lattice distributions with compatible lattices; 
i.e., we assume that the support of the resulting convolved distribution has length strictly 
smaller than the product of the lengths of the supports of the operands. For discrete 
distributions, we check whether they can be cast to lattice distributions with compatible 
lattices. If one operand is absolutely continuous, the other one discrete, we proceed by 
"direct computation". If both operands are absolutely continuous, as described in Algo- 
rithm 3.4, we first discretize them to lattice distributions with same width w. The cdfs Fi 
and i*2 used in this algorithm will be obtained from the corresponding p-slots. For objects 
of class UnivarLcbDccDistribution, we proceed component-wise. 

Slots p and d of the resulting new object are then filled by Algorithm 3.4, described in 
detail in the next section. More precisely we will use variants of this algorithm for the 
absolutely continuous and the discrete/lattice case, respectively. 

Slot r of the new object consists in simply simulating pairs of variables by means of the 
r slots of the convolutional summands and then summing these pairs. Slot q is ob- 
tained by numerical inversion: For a continuous approximation of the quantile function 
we evaluate the function in slot p on an x-grid, exchange x- and y-axis and interpolate 
linearly between the grid points, for discrete distributions D we start with the vector 
pvec ^ p(D)(support(D)) and search for the support-point belonging to the largest member 
of pvcc smaller than or equal to the argument of q. 

2.5. General arithmetics of distributions in "distr" 

An important consequence of our approach of implementing distributions as classes is that 
this enables us to implement a fairly complete and accurate arithmetics acting on distri- 
butions respectively on random variables with corresponding distributions. 
The first observation to be made is that the image distribution of afRne linear transfor- 
mations can be explicitly spelt out for each of the slots r, d, p, and q. Hence, if X and 
Y are both univariate distributions, we define X-Y to mean the convolution of X and -Y. 
For distributions with support contained in (0,oo), also multiplication is easy: as log and 
exp are strictly monotone and differentiable transformations, the respective image distri- 
butions may also be spelt out explicitly, for each of the slots r, d, p, and q, and the 
X*Y=exp(log(X)+log(Y) ) . Splitting up the support of a distribution into positive, neg- 
ative, and 0-part (where each of the intersections may be empty), and interpreting this 
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as a mixture of possibly three distinct distributions, we can also allow general M- valued 
distributions as factors in multiplications; the result can then possibly be a mixture of a 
Dirac distribution in and an absolutely continuous distribution. For division we note that 
for distributions with positive support, X/Y=exp(log(X) -log(Y) ) , and similar arguments 
also allow us to cover powers, i.e.; expressions like X'Y. As an example, let us see how the 
distribution X = N x P looks like if iV ~ J\f{0, 1) and P ~ Poisson(A): 

R> X ^ NormO * Pois ( lambda = 1) 
R> q(X) ( .25) 
[1] -0.3471003 

R> p(X) (1:3) 

[1] 0.8545304 0.9409595 0.9729868 

R> r(X) (5) 

[1] 0.0000000 1.8531446 1.2856129 0.2602214 0.0000000 

p{> plot(X, cex. inner = 1, to . draw . arg = c ( 1 , 2) , 
+ inner = c("cdf ", "quantile function")) 




3. General purpose FFT algorithm 

The main idea of our algorithm is to use DFT, which may be calculated very fast by 
FFT. Hence, we start with a brief introduction to DFT and its convolution property (cf. 
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Theorem 3.2) where we follow Lesson 8 of Gasquet and Witomski (1999). Afterwards, we 
describe the convolution of cdf 's/densities in Section 3.2. 

3.1. Discrete Fourier Transformation 

Let m € N and let {xn)nez be a sequence of complex numbers with period m; i.e., Xn+m = 
Xn for all n G Z. Then, the DFT of order m is, 

DFT^iC^^C", (xo,xi,...,x„_i) ^ (£o,xi,...,x„_i) (3.1) 

where 

^ m— 1 

xn = -y] u.^ = e-^-/'", i = (3.2) 

j=0 

We obtain the DFT {xn)nez of {xn)nez by the periodic extension Xn+m = Xn for all n G Z. 
DFTm is represented by a matrix il^ with entries ujm = 0,1, . . . ,m — 1) and inverse 
= l/mQm {^m the conjugate DFT^); i.e., DFT^ is linear and bijective. 

Remark 3.1. (a) Computing xq, xi, . . . , Xm-i directly from equation (3.2), requires (m — 
1)^ complex multiplications and m(m — 1) complex additions. But, FFT as introduced 
by Cooley and Tukey (1965), is of just order mlogm. It works best for the case m = 2^ 
{p G N); see Lesson 9 of Gasquet and Witomski (1999). In case m = 2^^, direct computation 
needs 1046529 multiplications and 1047552 additions, whereas FFT only requires 4097 
multiplications and 10240 additions; see also Table 9.1 (ibid.). 

(b) If {xn)n£i. is a sequence of real numbers, it is possible to reduce the cost of computation 
by half; cf. Section 8.3 of Gasquet and Witomski (1999). 

(c) FFT is available in R as function fft. 

For DFTs we have the following convolution theorem: 

Theorem 3.2. Let x = {xn)nez Sind y = {yn)n<^z be two sequences of complex numbers 
with period m and let x = {xn)nez Sind y = (yn)nez be the corresponding DFTs. Then, 
the circular convolution product of x and y is dehned as, 



X * y = 




and it holds, 

z = mxy with z = x*y (3-4) 

where xy = (£„y„)„ez- 

The proof is standard; see for instance Kohl (2005, Theorem C.1.2). This Theorem implies 
the following result for A^-fold convolution products. 
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Proposition 3.3. Let x = {xn)nez be a sequence of complex numbers with period m and 
let X = {xn)nez be the corresponding DPT. Then, it holds, 

7^ = m^~'^x^ Nen (3.5) 

The proof immediately follows from Theorem 3.2 by induction. 



3.2. Convolution algorithm 

DFT is formulated for discrete (equidistant) sequences of complex numbers, as which we 
may interpret the probability function of the following special integer lattice distributions 

m—l 

Fi{x) = ^ Pi,j \j,oo){x) i = l,2 (3.6) 

j=0 

with 

m— 1 

Pi,j > i = 0, 1, . . . , m - 1 Yl Pi,j = 1 (3-7) 

j=0 

where x G M and m = 2'^ (q £ N). We extend pij {i = 1,2, j = 0, . . . ,m — 1) to two 
sequences pi = (pi,n)nez of real numbers with period 2m via, 

Pi J = i = 1,2 j = m, . . . , 2m — 1 (zero padding) (3.8) 

and 

Pi,k+2m = Pi,k V/c e Z (3.9) 
Then, the convolution F of Fi and F2 is an integer lattice distribution given by 

2m- 1 2 m- 1 

F{x) = {Fi * F2)ix) = ^ VTj ly^oo) (2;) with ttj := ^ pi^kP2,j-k (3.10) 

j=0 k=0 

where in particular TT2m-i = 0. Hence, in view of Theorem 3.2, n = (vrn)„gz = Pi * P2 and 
we can compute vr using FFT and its inverse. This result forms the basis of Algorithm 3.4. 



As it stands. Algorithm 3.4 will be presented for the case of absolutely continuous distri- 
butions, but with slight and obvious modifications this algorithm works for quite general 
distributions; for more details see also Section 3.3. 

Algorithm 3.4. Assume two absolutely continuous distributions F\,F2 on M. 

Step 1: (Truncation) 

If the support of Fi {i = 1, 2) is unbounded or "too large", we define numbers Ai,Bi G M, 
for given e > 0, such that, 

F,((-oo,^,)) = I and ^^((5^, 00)) = | (3.11) 
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and set A = min{Ai, A2} and B = max{i?i, i?2}- If this is not the case, we define 
A := min{Ff ^(0),F2"^(0)} and B := max{Ff ^(1), F2"^(l)} where Fr^ {i = 1,2) are 
the quantile functions of Fj. 

Step 2: (Discretization on a real grid) 

Given m = 2"^ (g G N) and Fi {i = 1,2), we define the lattice distributions 

™-i B - A 

Gii^) ■= ^Pi,j\A+(j+o.5)h,oo){x) h= ^ (3.12) 

where 



m 

j=0 



Pij = Fi{[A + jh, A + {j + l)h]) (3.13) 

for j = 0, 1, . . . , m — 1. 

Step 3: (Transformation to an integer grid) 

Based on Gi {i = 1,2), we define the integer lattice distributions 

m— 1 

G,{x) := Y,Pi,j\j,oo){x) i = l,2 (3.14) 

and extend pij {i = 1, 2, j = 0, . . . , m — 1) to two sequences pi = {pi,n)nez of real 
numbers with period 2m via, 

Pij = i = l,2 j = m,...,2m — l (zero padding) (3.15) 

and 

Pi,k+2m = Pi,k Vfc G Z (3.16) 

Step 4: (Convolution by FFT on integer grid) 

We calculate G = Gi * G2 FFT and its inverse as given in (3.10); i.e., 

2m- 1 2m~l 

G{x)= J2 ^iI[i,oo)(a;) := ^ Pi,kP2,j-k (3.17) 

j=0 k=0 

where in particular 7r2m-i = 0. 

Step 5: (Back-transformation to real grid) 
Given G, we obtain G = Gi * G2 by, 

2m-2 

G{x) = ^ VrjI[2A+(j + 1.5)/i,oo)(3;) (3.18) 
j=0 

That is, we additionally use a continuity correction of h/2, which improves the accuracy 
of the results. 
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Step 6: (Smoothing) 

Next, we use interpolation of the values of G on {2A, 2A + 1.5/i, . . . ,2B- 0.5/i, 2B} by 
linear functions to get a continuous approximation of F = Fi*F2. We obtain a contin- 
uous approximation of the density / of F by multiplying {0, ttq, vri, . . . , 7T2m-2, 0} by h 
and interpolating these values on the grid {2A, 2A + h, . . . , 2B — h, 2B} (no continuity 
correction) using linear functions. 

Step 7: (Standardization) 

To make sure that the approximation is indeed a probability distribution, we stan- 
dardize and f"^ by F'^{[2A,2B]) and / f\x)dx, respectively, where f f^{x)dx may 
be calculated numerically exactly, since is a piecewise linear function. 

For some instructive examples like tiie computation of (an approximation to) the station- 
ary regressor distribution of an AR(1) process, together with corresponding R sources see 
Ruckdeschel et al. (2011). 

3.3. Ramifications and extensions of this algorithm 

Algorithm 3.4 for lattice distributions: Obviously, Algorithm 3.4 applies to lattice 
distributions Fi,F2 on M defined on the same grid. In this case the algorithm essentially 
reduces to steps 1-5 and 7. Moreover, the results are numerically exact if the lattice distri- 
butions have finite support; cf. Section 5. In this case the algorithm consists only of steps 
2-5. 

Specification of "too large": In step 1, a support is considered as "too large" if a uni- 
form grid with a reasonable step-length produces too many grid points. In the same sense, 
the loss of mass included in step 1 of Algorithm 3.4 is, to some extent, controllable and in 
many cases negligible. 

Richardson Extrapolation: A technique to enhance the accuracy of Algorithm 3.4 for 
given q is extrapolation. But, for this to work properly, we need additional smoothness 
conditions for the densities. We could take this into account by introducing a new subclass 
SmoothDistribution for distributions with sufficiently smooth densities and a correspond- 
ing new method for the operator "+"; see also Section 4.1. 

Exponential Tilting: As a wrap-around effect, summation modulo m (cf. equation (3.3)) 
induces an aliasing error. Especially for heavy-tailed distributions - again at the cost of 
additional smoothness conditions for the densities - Algorithm 3.4 can thus be improved 
by a suitable change of measure (exponential tilting). So one might conceive a further 
subclass HeavyTailedSmoothDistribution and overload "+" for objects of these classes 
using exponential tilting; see also Section 4.1. 

Modification for M-Estimators: In view of Proposition 3.3, Algorithm 3.4 may eas- 
ily be modified to compute an approximation of the exact finite-sample distribution of 
M estimates, compare Ruckdeschel and Kohl (2010). In the cited reference, we compare 
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the results obtainable with this modified algorithm to other approximations of the exact 
finite-sample distribution of M estimates, like the saddle point approximation and higher 
order asymptotics. 

4. Connections to other approaches 

4.1. Algorithms based on DFT 

A very similar algorithm was proposed by Bertram (1981) to numerically evaluate com- 
pound distributions in insurance mathematics where he assumes claim size distributions 
of lattice type. Numerical examples and comparisons to other methods can be found 
in Biihlmann (1984) and Feilmeier and Bertram (1987). 

A mathematical formulation of the corresponding algorithm is included in Griibel and 
Hermesmeier (1999). However, the main purpose of their article is the investigation of 
the aliasing error. In case of a claim size distribution of lattice type they obtain a simple 
general bound for this error and show that it can be eliminated by exponential tilting. But, 
even without the smoothness assumptions needed for exponential tilting, the aliasing error 
can also be made very small if we choose e in step 1 of Algorithm 3.4 small enough and q 
in step 2 large enough. Thus, in many cases this effect is negligible. 

Moreover, if one considers absolutely continuous probability distributions, an initial dis- 
cretization step is necessary; see Step 2 of Algorithm 3.4. The corresponding error is studied 
in Griibel and Hermesmeier (2000) and it is shown that this error, under certain smoothness 
conditions, can be reduced by an extrapolation technique (Richardson extrapolation). 

Efficient and precise algorithms based on FFT for the convolution of heavy-tailed distri- 
butions are considered in Schaller and Temnov (2008). 

In Embrechts, Griibel, and Pitts (1993) the authors describe how one can use FFT to 
determine various quantities of interest in risk theory and insurance mathematics including 
the computation of the total claim size distribution, the mean and the variance of the 
process and the probability of ruin. Moreover, using FFT it is also possible to find the 
stationary waiting time distribution for a given customer inter-arrival time distribution and 
a given service time distribution in the G/G/1 queueing model; see Griibel (1991). 

4.2. Other algorithms 

For continuous distributions, instead of starting with a discretization of the cdf right away, 
we could also use the actual characteristic functions, i.e. the Fourier transformations of the 
corresponding distributions which then get inverted by the usual Fourier inversion formulae, 
see e.g. Chung (1974, Sec. 6. 2). As coined by Th. Lumely in a posting to r-help on March 
29, 2007, this is in particular useful if there are closed form expressions for the characteristic 
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functions as for instance for linear combination of independent x^-distributions. 

On the other hand, inverting characteristic functions is not a cure-ah procedure either, as 
may be seen when considering convolution powers of the uniform distribution on [—1/2,1/2]: 
The corresponding characteristic functions are (sin(t/2)/t)" which does if naively inverted 
cause quite some numerical problems. A more comprehensive account of this approach can 
be found in Cavers (1978), Abate and Whitt (1992) and Abate and Whitt (1995). 

Similarity, but with a restricted application range due to integrability one could stay on 
the real line using Laplace transformations; see for instance Abate and Whitt (1992) and 
Abate and Whitt (1995). 

In actuarial science, recursive schemes to compute convolution powers, the so-called Panjer 
recursions, have been in use for a long time. As Temnov and Warnung (2008) show, these 
recursive methods are slower than FFT when a sufficient precision of the estimated quantile 
is needed. 



5. Accuracy and computational efficiency of our algorithm 

To assess the accuracy and computational efficiency of our algorithm, we present checks for 
n-fold convolution products where the exact results are known. In addition, we approximate 
probabilities of non-central x^-distributions. 

5.1. Accuracy 

We determine the precision of the convolution algorithm in terms of the total variation 
distance of the densities, 

dv{P, Q) = lI\p-q\diJ. = sup \P{B) - Q{B) I (5.1) 

where P,Q£ Mi(E) with dP = pdfi, dQ = qd^ for some cr-finite measure fi on (M,B) and 
the Kolmogorov distance of the cumulative distribution functions, 

d«(P,Q) = sup|P((-oo,t]) -Q((-oo,t])| (5.2) 

Obviously, d^ < dy as the supremum in case of the total variation distance is taken over 
more sets. In the sequel dS, and stand for the numerical approximations of d^ and 
Due to numerical inaccuracies we obtain (is > dS, in some cases. 

The first example treats Binomial distributions and shows that the convolution algorithm 
is very accurate for integer lattice distributions with finite support. 

Example 5.1. Assume F = Bin(/c,p) with A; G N and p S (0,1). Then, the n-fold 
convolution product is F*" = Bin(n/i;,p) (n G N). Let /„ and be the probability 
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functions of F*"" and F\ respectively. Then, we may determine dy and numerically 
exact by, 

^ nk 

dl{F,F^) = -J2\fn{j)-f\j)\ (5.3) 

j=0 

and 

di{F,F^)= max |F™([0,i])-FH[0,j])| (5.4) 

je{0,...,nfc} 

We obtain the results contained in Table 1 which show that Algorithm 3.4 is very accurate 
in case of binomial distributions, where the values of k and p are chosen arbitrarily. To get 
the corresponding results we use our R packages distr and distrEx. For example 

R> library ( distrEx ) 

R> distroptions (TruncQuantile = le-15) 

R> Bl ^ BinomCsize = 30, prob = 0.8) 

R> B2 <~ convpow(Bl, N = 10) 

R> Dl as(Bl, "LatticeDistribution ") 

R> D2 -S- convpow(Dl, N = 10) 

R> TotalVarDist (B2 , D2) 

total variation distance 
2.266428e-15 

R> KolmogorovDist (B2 , D2) 

Kolmogorov distance 
1.249001e-15 

where B2 is computed using the exact formula and D2 is the approximation via FFT. To 
increase accuracy we change the default value of option TmncQuantilc from le— 5 to le— 15. 



n 


k 


P 


d^ 


d^ 


2 


10 


0.5 


3.3e-16 


2.2e-16 


5 


20 


0.7 


1.7e-15 


9.6e-16 


10 


30 


0.8 


2.6e-15 


1. le-15 


100 


15 


0.2 


5.3e-15 


4.3e-15 


1000 


50 


0.4 


8.3e-13 


4.2e-13 



Table 1: Precision of the convolution of binomial distributions via FFT; see 
Example 5.1. 
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In case of the Poisson distribution the results of the convolution algorithm turn out to be 
very accurate, too. 

Example 5.2. We consider F = Pois (A) with A G (0, oo) where F*"' = Pois (nA) (n e N). 
Since the support of F is No, we use ^ = and B = F^^{1 — le— 15) in step 1 of 
Algorithm 3.4 and determine dy and cij^ numerically exact by. 



dUi^,i^^) = ^El/n(j)-/^(i)l (5.5) 
j=0 

and 

di{F,F^)= ^ max |F-([0,j])-F^([0,j])| (5.6) 
je{o,...,M} 

where M is the 1 — le— 15 quantile of F*". We obtain the results contained in Table 2 which 
demonstrate the high precision of the convolution algorithm in case of Poisson distributions 
where the parameter A is chosen arbitrarily. The results can be obtained via our R packages 
distr and distrEx analogously to the Binomial case. 

R> library ( distrEx ) 

R> distroptions (TruncQuantile = le-15) 
R> PI Pois (lambda = 15) 

R> P2 ^ convpow(Pl, N = 100) 
R> Dl ^ as(Pl, "LatticeDistribution ") 
R> D2 ^ convpow(Dl, N = 100) 
R> TotalVarDist (P2 , D2) 
total variation distcince 
1.616895e-13 

R> KolmogorovDist (P2 , D2) 

Kolmogorov distance 
8.85958e-14 



In the next two examples we consider the convolution of absolutely continuous distribu- 
tions. We determine the total variation distance i^'') by numerical integration using 
the R function integrate. To compute an approximation of the Kolmogorov distance, we 
evaluate d\{F,F^) on a grid obtained by the union of a deterministic grid of size le05 and 
two random grids consisting of le05 pseudo-random numbers of the considered distribu- 
tions. We first present the results for normal distributions. 

Example 5.3. Assume F = J\f{fi,a'^) with G M and a G (0,oo). Then it holds, 
F*" = J\f {n^^na"^) (n G N). Starting with AA(0, 1) and A and B as defined in step 1 of 
Algorithm 3.4 we obtain A = aA + n and B = aB + n in case oi M {n, a'^). That is, the grid 
transforms the same way as the normal distributions do. Thus, we expect the precision 
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n 


A 




d^ 


2 


0.1 


2.9e-16 


2.2e-16 


5 


10.0 


3.7e-15 


3.1e-15 


10 


7.5 


4.0e-15 


4.0e-15 


100 


15.0 


1.8e-13 


l.Oe-13 


1000 


50.0 


2.0e-ll 


l.Oe-11 



Table 2: Precision of the convolution of Poisson distributions via FFT; see 
Example 5.2. 



of the results to be independent of ^ and a. This is indeed confirmed by the numerical 
calculations; see Table 3. We therefore may consider /j, = and a = 1 for the study of the 
accuracy of the convolution algorithm subject to n E N, e > (step 1) and g S N (step 2). 
The results included in Table 4 show that the precision is almost independent of n. It 
mainly depends on q where the maximum accuracy, we can reach, is of order e. The results 
can be computed with our R packages distr and distrEx similarily to the Binomial and 
Poisson case. 

R> library ( distrEx ) 

R> distroptions (TruncQuantile = le-10) 

R> distroptions (DefaultNrFFTGridPointsExponent = 14) 
R> Nl Norm (mean =0, sd = 1) 
R> N2 convpow(Nl, N = 2) 

R> Dl ^ as(Nl, " AbscontDistribution ") 
R> D2 ^ convpow(Dl, N = 2) 
R> distroptions (TruncQuantile = le-15) 
R> TotalVarDist (N2 , D2 , rel.tol = le-10) 
total variation distance 
9.806121e-08 

R> KolmogorovDist (N2 , D2) 

Kolmogorov distance 
1.700898e-07 

Our last example treats the convolution of exponential distributions which leads to gamma 
distributions. 

Example 5.4. We consider F = Exp (A) = r(l,A) with A G (0,oo). Then it holds, 
p*n _ Y[n,X) (n G N). Analogously to the normal case (cf. Example 5.3), the grid 
transforms the same as the exponential distributions do; i.e. A = 1/\A and B = 1/XB. 
Thus, we expect the precision of the results to be independent of A. This is again confirmed 
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n 


£ 


Q 




a 


di 


d^ 








-10.0 


100.0 


1.2e-06 


2.1e-06 








-2.0 


5.0 


1.2e-06 


2.1e-06 


2 


le-08 


12 


0.0 


1.0 


1.2e-06 


2.1e-06 








1.0 


50.0 


1.2e-06 


2.1e-06 








100.0 


1000.0 


1.2e-06 


2.1e-06 



Table 3: Precision of the convolution of normal distributions via FFT is 
independent of the parameters n and a; see Example 5.3. 



n 


£ 


Q 


dl 


d^ 


2 


le-06 


8 


2.2e-04 


3.9e-04 


10 


1.3e-05 


2.3e-05 


12 


3.5e-06 


1.8e-06 


le-08 


10 


1.9e-05 


3.4e-05 


12 


1.2e-06 


2. le-06 


14 


8.5e-08 


1.2e-07 


le-10 


12 


1.6e-06 


2.7e-06 


14 


9.8e-08 


1.7e-07 


18 


5.2e-10 


5.3e-10 


5 


le-08 


12 


3.4e-06 


9.7e-04 


16 


6.6e-08 


6.1e-05 


10 


le-08 


12 


l.le-05 


l.le-05 


16 


6.3e-08 


3.5e-08 


50 


le-08 


12 


1.6e-04 


9.6e-05 


18 


l.Oe-07 


5.3e-08 



Table 4: Precision of the convolution of normal distributions via FFT; see 
Example 5.3. 



by our numerical computations; see Table 5. Next we study the dependence of the accuracy 
of Algorithm 3.4 on n G N, e > and q G N where we may choose A = 1.0. As in 
Example 5.3 the precision is almost independent of n. It mainly depends on q where the 
maximum accuracy, we can reach, is of order e; see Table 6. The results can be computed 
with our R packages distr and distrEx similarity to the previous cases. 

Il> library ( distrEx ) 

Il> distroptions (TruncQuantile = le-8) 

R> distroptions (DefaultNrFFTGridPointsExponent = 16) 



20 



General Purpose FFT Convolution Algorithm 



R> El <- ExpCrate = 1) 
R> E2 <- convpow(El, N = 5) 
R> Dl -;- as(El, " AbscontDistribution ") 
R> D2 -;- convpow(Dl, N = 5) 
R> distroptions (TruncQuantile = le-15) 
R> TotalVarDist (E2 , D2 , rel.tol = le-10) 
total variation distcince 
1.39883e-07 

R> KolmogorovDist (E2 , D2) 

Kolmogorov distance 
9.453503e-08 



n 


e 


q 


A 




di 








0.01 


5.6e-06 


4.0e-05 








0.5 


5.6e-06 


4.0e-05 


2 


le-08 


12 


1.0 


5.6e-06 


4.0e-05 








5.0 


5.6e-06 


4.0e-05 








10.0 


5.6e-06 


4.0e-05 



Table 5: Precision of the convolution of exponential distributions via FFT 
is independent of the parameter A; see Example 5.4. 



Remark 5.5. Example 5.4 reveals one minor flaw of Algorithm 3.4. The support of 
r(n, A) is [0, do) whereas the convolution algorithm is only very accurate in [2A + (n/2 + 
0.5)h, . . . ,2B — {n/2 + 0.5)h]. That is, for small n (n < 5) the Kolmogorov distance 
is F([0,2^ + (n/2 + 0.5)/i)) - F'^{[0,2A + (n/2 + 0.5)/i)). However, for bigger n this 
inaccuracy disappears as there is less and less mass in [0, 2A + (n/2 + 0.5)/i). Moreover, 
since (n/2 + 0.5)h is very small, this also causes the numerical inaccuracy of dv for small 
n and leads to d^ > d'i. 

Example 5.6. In this last example we show how our FFT approach can be used to compute 
probabilities for non-central x^-distributions where the exact values are difficult to obtain. 
Let X be a non-central distributed random variable with df degrees of freedom and non- 
centrality parameter ncp; i.e., X ~ Xdf (^^P)- Our goal is to approximate the cdf P{X < x) 
at X G (0, oo). In Table 7 we give exact values of Patnaik (1949) (Patnaik), approximations 
by Ittrich, Krause, and Richter (2000) (Ittrich), approximations by function pchisq of 
package stats (R) as well as the results of three FFT approaches (FFT1-FFT3). In the 
first case (FFTl) we approximate X by 

X^Zf + Zl + ... + zl with ~ l) (5.7) 
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n 


e 


Q 


dl 


dl 


2 


le-06 


8 


7.5e-04 


4.7e-03 


10 


4.7e-05 


3.4e-04 


12 


4.5e-06 


2.2e-05 


le-08 


10 


8.1e-05 


6.0e-04 


12 


5.6e-06 


4.0e-05 


16 


3.6e-08 


1.6e-07 


le-10 


12 


8.0e-06 


6.2e-05 


14 


5.1e-07 


3.9e-06 


20 


2.7e-10 


9.6e-10 


5 


le-08 


12 


2.7e-05 


2.8e-05 


16 


1.4e-07 


9.5e-08 


10 


le-08 


12 


1.4e-04 


1.4e-04 


16 


6.2e-07 


5.3e-07 


50 


le-08 


12 


4.9e-03 


4.9e-03 


20 


3.8e-07 


3.8e-07 



Table 6: Precision of the convolution of exponential distributions via FFT; 
see Example 5.4. 



Secondly (FFT2) we use 

X^Zl + Zl + ... + Zl,_^ + Zl, (5.8) 

where Zi ^ N (0, 1) for i = 1, . . . , df — 1 and Z^f ~ N (-^/ncp, 1). Our third approximation 
(FFT3) reads 

X^Y + Z'^ (5.9) 

where Y ~ Xdf-i(O) central x^-distribution) and Z ^ N (^ncp, 1). 
For the FFT computations we used e = le— 08 and q = 18. All three FFT approaches 
give very good approximations. In particular, FFT3 yields results which have the same 
accuracy as pchisq and the approximation of Ittrich et al. (2000). 

p{> library (distr) 

R> distroptions (withgaps = FALSE) 

R> distroptions (TruncQuantile = le-8) 

R> distroptions (DefaultNrFFTGridPointsExponent = 18) 

R> dfO ^ 4 

R> ncpO <- 4 

R> xO ^ 1 . 765 

R> ## FFTl 
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R> Z <— NormCmean = sqrt ( ncpO /df ) ) 

R> Z2 4- Za2 

R> resl •;- convpow(Z2, N = dfO) 

R> ## FFT2 

R> Z ^ NormO 

R> Z2 ^ Za2 

R> X2 ^ convpow(Z2, N = dfO-1) 

R> Y2 ^ NormCmean = sqrt (ncpO ) ) a2 

R> res2 ^ X2 + Y2 

R> ## FFT3 

R> res3 ^ ChisqCdf = dfO-1) + Y2 

R> ## Results 

R> res ^ c (p (resl ) (xO) , p(res2)(x0), p(res3)(x0), 
+ pchisqCxO, df = df , ncp = ncpO)) 

R> names (res) 4- cC'FFTl", "FFT2", "FFT3", "R" ) 

R> res 

FFTl FFT2 FFT3 R 

0.04999865 0.04999924 0.04999936 0.04999937 



df 


ncp 


X 


Patnaik 


Ittrich 


R 


FFTl 


FFT2 


FFT3 


4 


4 


1.765 


0.0500 


0.0499994 




87 


2 












10.000 


0.7118 


0.7117928 




5 














17.309 


0.9500 


0.9499957 


















24.000 


0.9925 


0.9924604 


















10 


10.000 


0.3148 


0.3148207 




4 


6 












7 


1 


4.000 


0.1628 


0.1628330 




13 


15 












16.004 


0.9500 


0.9500015 


6 


4 




6 










16 


10.257 


0.0500 


0.0499942 




39 


39 












24.000 


0.5898 


0.5863368 




6 


4 












38.970 


0.9500 


0.9499992 


















12 


6 


24.000 


0.8187 


0.8173526 




10 


11 












18 


24.000 


0.2901 


0.2900495 




73 


71 












16 


8 


30.000 


0.7880 


0.7880015 




.... 79948 


.... 79994 








40.000 


0.9632 


0.9632255 




43 


1 












32 


30.000 


0.0609 


0.0628420 


1 


392 


09 












60.000 


0.8316 


0.8315635 




14 


23 












24 


24 


36.000 


0.1567 


0.1567111 




018 


023 












48.000 


0.5296 


0.5296284 




177 


174 












72.000 


0.9667 


0.9666954 




44 


41 













Table 7: Approximations of the cdf of non-central x^-distributions via FFT; 

see Example 5.6. [e = le— 08, q = 18, only the decimal places which 
are different to Ittrich are given] 
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5.2. Computational Efficiency 

To judge the computational efficiency of our algorithm, let us check it in a situation where 
the exact solution of the convolution is known, i.e.; at the 10-fold convolution of independent 
Xi(0) distributions. As timings are of course subject to hardware considerations we report 
relative timings, where as reference we use the implementation in R package actuar. As 
for general distributions, actuar already needs probabilities evaluated on a grid, we have 
to wrap the respective function aggregateDist into a function convActuar first, providing a 
respective discretization. 

R> gcO 

R> library ( actuar ) 

R> distroptions (TruncQuantile = le-5) 

R> distroptions (DefaultNrFFTGridPointsExponent = 12) 

R> ## convolution using package "actuar" 

R> convActuar function(N = 2, df = 1, ncp = 0, 



+ method = "lower")-[ 

+ Dl ^ ChisqCdf = df, ncp = ncp) 

+ lo getLow(Dl) 

+ up ^ getUp(Dl) 

+ dGPExp •;- getdistrOption ( "Def aultNrFFTGridPointsExponent ") 

+ m ^ max(dGPExp-f loor ( log (N) /log (2) ) , 5) 

+ M 2Am 

+ h (up-lo)/M 

+ probs <s— discretize (pchisq (x , df = df, ncp = ncp), 
+ from = lo , to = up, by = h, 

+ method = method) 

+ X ^ seqCfrom = N*lo + N/2=Kh, to = N*up-N/2=Kh, by = h) 

+ X <- c(x[l]-h, x[l], x+h) 

+ dx -S— aggregateDist (method = "convolution", 

+ model. freq = c(rep(0, N),l), 

+ model. sev = probs) 

+ list(d=dx, 

+ X = x) 



+ > 

No matter which of the methods implemented in function discretize of package actuar, 
i.e.; "rounding", "lower", or "upper", our algorithm compares fairly well as to both timings 
and accuracy: 

R> ## actuar (method = "rounding") 

R> system . time (resl convActuar (method = "rounding")) 

user system elapsed 
0.01 0.00 0.01 
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B> ## distr 

R> Dl •;- asCChisqO, " Abs contDi str ibut i on " ) 
R> system . time (D2 4- convpow(Dl = Dl , N = 2)) 

user system elapsed 
0.01 0.00 0.01 

R> ## max . distance between actuar and distr (method = "rounding 
R> summaryCabs (resl$d (knots (resl$d) ) - p(D2)( 
+ resl$x [c (-1 , -length (res l$x) ) ] ) ) ) 

Min. 1st Qu. Median Mean 3rd Qu. Max. 

2.005e-05 2.005e-05 2.018e-05 1.905e-04 4.636e-05 2.910e-03 

R> ## max. distance between actuar and distr (method = "upper") 
R> system . time (res2 4— convActuar (method = "upper")) 

user system elapsed 
0.02 0.00 0.02 

R> summary(abs (res2$d (knots (res2$d) ) - p(D2)( 
+ res2$x [c (-1 , -length ( res2$x ) ) ] ) ) ) 

Min. 1st Qu. Median Mean 3rd Qu. Max. 

3.600e-08 1.968e-05 2.000e-05 7.301e-05 2.000e-05 1.317e-03 

R> ## max. distance between actuar and distr (method = "lower") 
R> system . time (res3 4— convActuar (method = "lower")) 

user system elapsed 
0.02 0.00 0.02 

R> summary(abs (res3$d(knots (res3$d) ) - p ( D2 ) ( r e s3$x ) ) ) 

Min. 1st Qu. Median Mean 3rd Qu. Max. 

O.OOOe+00 2.000e-05 2.017e-05 2.006e-04 4.664e-05 4.855e-03 



To see the differences more clearly, let us repeat this 100 times. 



R> speedref 4— function ( expr . ref , rep. times = 100){ 

+ ref. time 4— sy st em . t ime ( for ( i in l:rep. times) 

+ res 4— eval ( expr . ref )) [1] 

+ names ( ref . time ) 4— NULL 

+ return ( list (res = res, ref. time = ref. time)) 

+ } 

R> speedcheck 4— function ( expr , ref. time, rep. times = 100)-[ 
+ r.time 4— system . t ime ( for ( i in l:rep. times) 

+ res 4— eval ( expr ))[ 1] /ref . t ime 

+ names ( r . t ime ) 4— NULL 

+ return ( list (res = res, r.time = r.time)) 

+ > 
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Comparing the relative timings we get the foUowing result (where timings are reported as 
percentages relative to convActuar): 

R> rep ^ 100 

R> refset -S— speedref (quote ( convActuar (N = 10, method = "lower")), 
+ rep. times = rep) 

R> rl ^ speedcheck (expr = quote ( convpow (Dl = Dl , N = 10)), 
+ ref.time = ref set$ref . time , rep. times = rep) 

R> r2 ^ speedcheck (expr = quote ( convpow (Dl = ChisqO, N = 10)), 
+ ref.time = ref set$ref . time , rep. times = rep) 

R> r3 -S— speedcheck ( expr = quote ( Chi sq ( df = 10)), 

+ ref.time = ref setSref . time , rep. times = rep) 

R> res f- refsetSres 

R> DIO ^ rl$res; Dex ^ r2$res ; Dcheck r3$res 

R> ## absolute time of convActuar (100 replicates ) : 

R> round ( ref set $ref . t ime , 2) 

[1] 3.87 

R> ## relative timings in percentages w.r.t. convActuar 

R> print (round (c ( " actuar "=1 , "FFT "=rl$r .time , 

+ " Chisq-Meth "=r2$r . time , 

+ " exact "=r3$r . time )* 100 , 2) ) 

actuar FFT Chisq-Meth exact 

100.00 40.31 25.58 7.24 

As to accuracy, our algorithm still is competitive: 

R> summary (abs (res$d (knots (res$d) ) [-C ( 1 : 8) ] - p ( D 10 ) ( res $x ) ) ) 

Min. 1st Qu. Median Mean 3rd Qu. Max. 

0.0000000 0.0001000 0.0001000 0.0001927 0.0001000 0.0019000 

R> summary (abs (res$d (knots (res$d) ) [-C ( 1 : 8) ] - p (Dex) (res$x) ) ) 

Min. 1st Qu. Median Mean 3rd Qu. Max. 

2.010e-08 l.OOOe-04 l.OOOe-04 2.455e-04 l.OOOe-04 3.120e-03 

R> summary (abs (p (Dex) (res$x)-p(D10)(res$x))) 

Min. 1st Qu. Median Mean 3rd Qu. Max. 

O.OOOe+00 O.OOOe+00 O.OOOe+00 6.111e-05 1.758e-07 1.253e-03 

Note that the computations with aggregateDist of actuar get considerably more expensive 
if you pass to finer discretizations, as we show in the following illustration which now cuts 
off lower and upper 10~^-quantiles (instead of 10~^ beforehand) and which uses 4 times 
as many discretization points (with only 30 replications) — again we report percentages 
relative to convActuar: 
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R> distroptions (TruncQuantile = le-6) 

R> distroptions (DefaultNrFFTGridPointsExponent = 14) 

R> rep 30 

R> ## absolute time of convActuar (30 replicates ) : 

R> round ( r ef set $ref . t ime , 2) 

[1] 82.2 

R> ## relative timings in percentages w.r.t. convActuar 

R> print (round (c ( " actuar " = 1 , "FFT " = rl$r .time , 

+ " Chisq-Meth "=r2$r . time , 

+ "exact" = r3$r.time)>Kl00,2)) 

actuar FFT Chisq-Meth exact 

100.00 1.76 0.35 0.13 

R> ## accuracy 

R> summaryCabs (res$d (knots (res$d) ) [-C ( 1 : 8) ] - p ( D 10 ) ( res $x ) ) ) 

Min. 1st Qu. Median Mean 3rd Qu. Max. 

O.OOOe+00 l.OOOe-05 l.OOOe-05 3.401e-05 l.OOOe-05 5.745e-04 

R> summary(abs (res$d(knots (res$d) ) [-C (1 : 8)] - p (Dex) (res$x) ) ) 

Min. 1st Qu. Median Mean 3rd Qu. Max. 

l.OOOe-10 l.OOOe-05 l.OOOe-05 4.172e-05 l.OOOe-05 7.835e-04 

R> summary (abs (p(Dex)(res$x)-p(D10)(res$x))) 

Min. 1st Qu. Median Mean 3rd Qu. Max. 

O.OOOe+00 O.OOOe+00 O.OOOe+00 8.643e-06 1.560e-09 2.149e-04 

6. Conclusion 

With our implementation of a general default convolution algorithm for distribtions in the 
object oriented framework of R we provide a flexible framework which combines scalable 
accuracy and reasonable computational efficiency. This framework lends itself for intro- 
ductory courses in statistics where students can easily sharpen their intuition about how 
convolution and other arithmetic operations work on distributions. It is however not lim- 
ited to merely educational purposes but can be fruitfully applied to many problems where 
one needs the exact distributions of convolutions, as arising e.g. in finite sample risk of M 
estimators (Ruckdeschel and Kohl 2010), actuarial sciences and risk management (Singh 
2010), linguistics (Schaden 2012), and Bingo premia calculations (Kroisandt and Ruckde- 
schel 2011). 
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